Search CORE

40 research outputs found

View Selection in Semantic Web Databases

Author: François Goasdoué
François Goasdoué
Ioana Manolescu
Julien Leblay
Julien Leblay
Konstantinos Karanasos
Konstantinos Karanasos
Équipes-projets Leo
Publication venue
Publication date: 01/01/2011
Field of study

We consider the setting of a Semantic Web database, containing both explicit data encoded in RDF triples, and implicit data, implied by the RDF semantics. Based on a query workload, we address the problem of selecting a set of views to be materialized in the database, minimizing a combination of query processing, view storage, and view maintenance costs. Starting from an existing relational view selection method, we devise new algorithms for recommending view sets, and show that they scale significantly beyond the existing relational ones when adapted to the RDF context. To account for implicit triples in query answers, we propose a novel RDF query reformulation algorithm and an innovative way of incorporating it into view selection in order to avoid a combinatorial explosion in the complexity of the selection process. The interest of our techniques is demonstrated through a set of experiments.Comment: VLDB201

arXiv.org e-Print Archive

HAL-CentraleSupelec

CiteSeerX

INRIA a CCSD electronic archive server

Oxford University Research Archive

HAL-Rennes 1

RDFViewS: A Storage Tuning Wizard for RDF Applications

Author: Goasdoué François
Karanasos Konstantinos
Leblay Julien
Manolescu Ioana
Publication venue
Publication date: 01/01/2010
Field of study

In recent years, the significant growth of RDF data used in numerous applications has made its efficient and scalable manipulation an important issue. In this paper, we present RDFViewS, a system capable of choosing the most suitable views to materialize, in order to minimize the query response time for a specific SPARQL query workload, while taking into account the view maintenance cost and storage space constraints. Our system employs practical algorithms and heuristics to navigate through the search space of potential view configurations, and exploits the possibly available semantic information - expressed via an RDF Schema - to ensure the completeness of the query evaluation

arXiv.org e-Print Archive

HAL-CentraleSupelec

CiteSeerX

INRIA a CCSD electronic archive server

Oxford University Research Archive

HAL-Rennes 1

Computational Fact Checking: A Content Management Perspective

Author: Cazalens Sylvie
Lamarre Philippe
Leblay Julien
Manolescu Ioana
Tannier Xavier
Publication venue: 'VLDB Endowment'
Publication date: 26/08/2018
Field of study

International audienc

HAL-CentraleSupelec

INRIA a CCSD electronic archive server

SPARQL query answering with bitmap indexes

Author: Leblay Julien
Publication venue: HAL CCSD
Publication date: 20/05/2012
Field of study

International audienceWhen querying RDF data, one may use reasoning to reach intensional data, i.e., data defined by sets of rules. This is usually achieved through forward chaining, with space and maintenance overheads, or backward chaining, with high query evaluation and optimization costs. Recent approaches rely on pre-computing the terminological closure of the data rather than the full saturation. In this setting, one can even query the data without resorting to backward chaining, using a so-called semantic index. However, these techniques are limited in the type of queries they can support. In this paper, we introduce a data storage technique which mitigates the space issues of forward-chaining. We show that it can also be used with a semantic index. We propose a new structure for the index that relies on bitmaps making it resilient to updates. Our experimental results demonstrate that our storage model significantly reduces the space required to store the data. We show that the indexes can be computed quickly and fit well in memory even for very large ontologies. Finally, we analyze how query answering is affected by the data layout

HAL-CentraleSupelec

Techniques d'optimisation pour des données semi-structurées du web sémantique

Author: Leblay Julien
Publication venue: HAL CCSD
Publication date: 27/09/2013
Field of study

Since the beginning of the Semantic Web, RDF and SPARQL have become the standard data model and query language to describe resources on the Web. Large amounts of RDF data are now available either as stand-alone datasets or as metadata over semi-structured documents, typically XML. The ability to apply RDF annotations over XML data emphasizes the need to represent and query data and metadata simultaneously. While significant efforts have been invested into producing and publishing annotations manually or automatically, little attention has been devoted to exploiting such data. This thesis aims at setting database foundations for the management of hybrid XML-RDF data. We present a data model capturing the structural aspects of XML data and the semantics of RDF. Our model is general enough to describe pure XML or RDF datasets, as well as RDF-annotated XML data, where any XML node can act as a resource. We also introduce the XRQ query language that combines features of both XQuery and SPARQL. XRQ not only allows querying the structure of documents and the semantics of their annotations, but also producing annotated semi-structured data on-the-fly. We introduce the problem of query composition in XRQ, and exhaustively study query evaluation techniques for XR data to demonstrate the feasibility of this data management setting. We have developed an XR platform on top of well-known data management systems for XML and RDF. The platform features several query processing algorithms, whose performance is experimentally compared. We present an application built on top of the XR platform. The application provides manual and automatic annotation tools, and an interface to query annotated Web page and publicly available XML and RDF datasets concurrently. As a generalization of RDF and SPARQL, XR and XRQ enables RDFS-type of query answering. In this respect, we present a technique to support RDFS-entailments in RDF (and by extension XR) data management systems.RDF et SPARQL se sont imposés comme modèle de données et langage de requêtes standard pour décrire et interroger les données sur la Toile. D’importantes quantités de données RDF sont désormais disponibles, sous forme de jeux de données ou de méta-données pour des documents semi-structurés, en particulier XML. La coexistence et l’interdépendance grandissantes entre RDF et XML rendent de plus en plus pressant le besoin de représenter et interroger ces données conjointement. Bien que de nombreux travaux couvrent la production et la publication, manuelles ou automatiques, d’annotations pour données semi-structurées, peu de recherches ont été consacrées à l’exploitation de telles données. Cette thèse pose les bases de la gestion de données hybrides XML-RDF. Nous présentons XR, un modèle de données accommodant l’aspect structurel d’XML et la sémantique de RDF. Le modèle est suffisamment général pour représenter des données indépendantes ou interconnectées, pour lesquelles chaque nœud XML est potentiellement une ressource RDF. Nous introduisons le langage XRQ, qui combine les principales caractéristiques des langages XQuery et SPARQL. Le langage permet d’interroger la structure des documents ainsi que la sémantique de leurs annotations, mais aussi de produire des données semi-structurées annotées. Nous introduisons le problème de composition de requêtes dans le langage XRQ et étudions de manière exhaustive les techniques d’évaluation de requêtes possibles. Nous avons développé la plateforme XRP, implantant les algorithmes d’évaluation de requêtes dont nous comparons les performances expérimentalement. Nous présentons une application reposant sur cette plateforme pour l’annotation automatique et manuelle de pages trouvées sur la Toile. Enfin, nous présentons une technique pour l’inférence RDFS dans les systèmes de gestion de données RDF (et par extension XR)

INRIA a CCSD electronic archive server

Database techniques for semantics-rich semi-structured Web data

Author: Leblay Julien
Publication venue
Publication date: 27/09/2013
Field of study

RDF et SPARQL se sont imposés comme modèle de données et langage de requêtes standard pour décrire et interroger les données sur la Toile. D’importantes quantités de données RDF sont désormais disponibles, sous forme de jeux de données ou de méta-données pour des documents semi-structurés, en particulier XML. La coexistence et l’interdépendance grandissantes entre RDF et XML rendent de plus en plus pressant le besoin de représenter et interroger ces données conjointement. Bien que de nombreux travaux couvrent la production et la publication, manuelles ou automatiques, d’annotations pour données semi-structurées, peu de recherches ont été consacrées à l’exploitation de telles données. Cette thèse pose les bases de la gestion de données hybrides XML-RDF. Nous présentons XR, un modèle de données accommodant l’aspect structurel d’XML et la sémantique de RDF. Le modèle est suffisamment général pour représenter des données indépendantes ou interconnectées, pour lesquelles chaque nœud XML est potentiellement une ressource RDF. Nous introduisons le langage XRQ, qui combine les principales caractéristiques des langages XQuery et SPARQL. Le langage permet d’interroger la structure des documents ainsi que la sémantique de leurs annotations, mais aussi de produire des données semi-structurées annotées. Nous introduisons le problème de composition de requêtes dans le langage XRQ et étudions de manière exhaustive les techniques d’évaluation de requêtes possibles. Nous avons développé la plateforme XRP, implantant les algorithmes d’évaluation de requêtes dont nous comparons les performances expérimentalement. Nous présentons une application reposant sur cette plateforme pour l’annotation automatique et manuelle de pages trouvées sur la Toile. Enfin, nous présentons une technique pour l’inférence RDFS dans les systèmes de gestion de données RDF (et par extension XR).Since the beginning of the Semantic Web, RDF and SPARQL have become the standard data model and query language to describe resources on the Web. Large amounts of RDF data are now available either as stand-alone datasets or as metadata over semi-structured documents, typically XML. The ability to apply RDF annotations over XML data emphasizes the need to represent and query data and metadata simultaneously. While significant efforts have been invested into producing and publishing annotations manually or automatically, little attention has been devoted to exploiting such data. This thesis aims at setting database foundations for the management of hybrid XML-RDF data. We present a data model capturing the structural aspects of XML data and the semantics of RDF. Our model is general enough to describe pure XML or RDF datasets, as well as RDF-annotated XML data, where any XML node can act as a resource. We also introduce the XRQ query language that combines features of both XQuery and SPARQL. XRQ not only allows querying the structure of documents and the semantics of their annotations, but also producing annotated semi-structured data on-the-fly. We introduce the problem of query composition in XRQ, and exhaustively study query evaluation techniques for XR data to demonstrate the feasibility of this data management setting. We have developed an XR platform on top of well-known data management systems for XML and RDF. The platform features several query processing algorithms, whose performance is experimentally compared. We present an application built on top of the XR platform. The application provides manual and automatic annotation tools, and an interface to query annotated Web page and publicly available XML and RDF datasets concurrently. As a generalization of RDF and SPARQL, XR and XRQ enables RDFS-type of query answering. In this respect, we present a technique to support RDFS-entailments in RDF (and by extension XR) data management systems

Theses.fr

Techniques d'optimisation pour des données semi-structurées du web sémantique

Author: Leblay Julien
Publication venue: HAL CCSD
Publication date: 27/09/2013
Field of study

HAL-CentraleSupelec

Thèses en Ligne

INRIA a CCSD electronic archive server

A Declarative Approach to Data-Driven Fact Checking

Author: Leblay Julien
Publication venue: Association for the Advancement of Artificial Intelligence
Publication date: 10/02/2017
Field of study

Fact checking is an essential part of any investigative work. For linguistic, psychological and social reasons, it is an inherently human task. Yet, modern media make it increasingly difficult for experts to keep up with the pace at which information is produced. Hence, we believe there is value in tools to assist them in this process. Much of the effort on Web data research has been focused on coping with incompleteness and uncertainty. Comparatively, dealing with context has received less attention, although it is crucial in judging the validity of a claim. For instance, what holds true in a US state, might not in its neighbors, e.g., due to obsolete or superseded laws. In this work, we address the problem of checking the validity of claims in multiple contexts. We define a language to represent and query facts across different dimensions. The approach is non-intrusive and allows relatively easy modeling, while capturing incompleteness and uncertainty. We describe the syntax and semantics of the language. We present algorithms to demonstrate its feasibility, and we illustrate its usefulness through examples

Association for the Advancement of Artificial Intelligence: AAAI Publications

SPARQL query answering with bitmap indexes

Author: Julien Leblay
Publication venue: Scottsdale‚ AZ‚ United States
Publication date: 01/01/2012
Field of study

HAL-CentraleSupelec

INRIA a CCSD electronic archive server

Oxford University Research Archive

HAL-Rennes 1